use fastchat conversations template #578
Just a quick look first.
@@ -443,6 +443,7 @@ datasets:
data_files: # Optional[str] path to source data files
shards: # Optional[int] number of shards to split data into
name: # Optional[str] name of dataset configuration to load
conversation: # Optional[str] fastchat conversation type, only used with type: sharegpt
- conversation: # Optional[str] fastchat conversation type, only used with type: sharegpt
+ conversation: # Optional[str] fastchat conversation type, only used with type: sharegpt. See options: https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py
import fastchat.conversation

fastchat.conversation.Conversation.get_turns = get_turns
fastchat.conversation.Conversation.get_prompt = get_prompt
May I ask what changed to require this patch? I see a lot of similarity to Conversation from fschat.
Yeah. Mostly the ability to reuse most of the predefined conversation types defined there.
Oh, didn't see it got merged while I was checking it
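For readers following the thread, here is a minimal sketch of the monkeypatch pattern discussed above: a `get_turns` generator is attached to an existing class so callers can walk the prompt one turn at a time, and `get_prompt` is rebuilt on top of it. The `Conversation` class here is a hypothetical stand-in written for illustration, not fastchat's actual class, and the turn format is an assumption.

```python
from dataclasses import dataclass, field
from typing import Generator, List, Tuple


@dataclass
class Conversation:
    """Hypothetical stand-in for a library class (NOT fastchat's definition)."""

    system: str
    roles: Tuple[str, str]
    sep: str = "\n"
    messages: List[Tuple[str, str]] = field(default_factory=list)

    def get_prompt(self) -> str:
        # Original behaviour: only the fully joined prompt is available.
        parts = [self.system + self.sep]
        parts += [f"{role}: {text}{self.sep}" for role, text in self.messages]
        return "".join(parts)


def get_turns(self) -> Generator[Tuple[str, str], None, None]:
    """Yield (prefix, text) pieces one turn at a time."""
    yield "", self.system + self.sep
    for role, text in self.messages:
        yield f"{role}: ", text + self.sep


def get_prompt(self) -> str:
    """Rebuild the full prompt by joining the individual turns."""
    return "".join(prefix + text for prefix, text in self.get_turns())


# Monkeypatch: attach the new methods to the existing class.
Conversation.get_turns = get_turns
Conversation.get_prompt = get_prompt

conv = Conversation(system="SYSTEM", roles=("USER", "ASSISTANT"))
conv.messages.append(("USER", "Hi"))
conv.messages.append(("ASSISTANT", "Hello!"))
turns = list(conv.get_turns())
prompt = conv.get_prompt()
```

Exposing turns separately lets a tokenizer treat each turn on its own (for example, to mask some turns from the loss) while the joined prompt stays unchanged.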
* use fastchat conversations template
* require fastchat (fschat) pip install
* handle roles dynamically from conversation
* tweak fastchat conversation with a monkeypatch to get individual turns
* fix up so it works with multiple conversation styles, and don't strip the turns
* fix sharegpt fixture now that we're using a more correct tokenization
* use a new prompter and support fastchat conversation type
* use sharegpt from prompt strategies now
* update docs, add chatml template
* add a newline after im_end token
* ensure we correctly set system message
* update per PR feedback to handle deprecated sharegpt types
* don't add duplicate wandb req
* make sharegpt fields configurable from yml
* llama2 fixes
* don't fail fatally when turns are improper
Use the vicuna_v1.1 conversation template from the fastchat package. This makes it easier to support additional conversation types, since we can simply load a conversation format template that is already registered.
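To illustrate why a registry of named templates makes new conversation types cheap to support, here is a rough, self-contained sketch. fastchat provides a lookup of this kind (`get_conv_template` in `fastchat.conversation`), but the code below does not call it: `Template`, `register`, and `get_template` are hypothetical names, and the system message and separators are simplified assumptions rather than the exact vicuna_v1.1 definition.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Tuple


@dataclass
class Template:
    """Hypothetical conversation template (illustrative only)."""

    name: str
    system: str
    roles: Tuple[str, str]
    sep: str
    sep2: str
    messages: List[Tuple[str, str]] = field(default_factory=list)

    def append_message(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def get_prompt(self) -> str:
        # Alternate separators: sep after even turns, sep2 after odd turns.
        seps = [self.sep, self.sep2]
        out = self.system + seps[0]
        for i, (role, text) in enumerate(self.messages):
            out += f"{role}: {text}{seps[i % 2]}"
        return out


_TEMPLATES: Dict[str, Template] = {}


def register(template: Template) -> None:
    _TEMPLATES[template.name] = template


def get_template(name: str) -> Template:
    # Return a fresh copy so callers never mutate the registered template.
    return replace(_TEMPLATES[name], messages=[])


# Assumed, simplified stand-in for a vicuna-style template.
register(Template(
    name="vicuna_v1.1",
    system="A chat between a user and an assistant.",
    roles=("USER", "ASSISTANT"),
    sep=" ",
    sep2="</s>",
))

conv = get_template("vicuna_v1.1")
conv.append_message(conv.roles[0], "Hi")
conv.append_message(conv.roles[1], "Hello!")
prompt = conv.get_prompt()
```

With this shape, supporting another conversation style is one more `register(...)` call, and the dataset config only needs to name the template.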